285 research outputs found

    Approximate failure recovery in distributed graph processing systems

    Distributed graph processing systems are an emerging area of big data systems. As graphs continue to grow in size and prevalence, these systems must become faster and more scalable. However, to handle failures, distributed graph processing systems either rely largely on proactive fault tolerance techniques such as checkpointing, or use no fault tolerance mechanism at all and simply restart computation. The former approach entails significant proactive overheads that increase with the size of the graph, while the latter wastes time and resources in potentially lengthy recomputation. In this thesis, we argue that distributed graph processing systems should instead use an approximate approach to failure recovery that trades off a minimal amount of application accuracy while reducing the overhead during failure-free execution to zero and allowing fast and scalable recovery. We build a system called Zorro that embodies the approximate reactive approach, and integrate Zorro into two distributed graph processing systems: PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication (inherent in today's graph processing systems) to quickly and scalably rebuild the state of failed servers. In addition, we describe three other novel failure recovery mechanisms that aim to address several of Zorro's shortcomings. The first relies on optimistic accuracy results from graph sampling and hence continues after failure without taking any action. The second repartitions the graph after failure to avoid waiting for replacement servers, and then continues computation with the recovered state. The last allows a small amount of proactive overhead to significantly increase the fraction of recovered state.
Experiments using five real-world graphs and eight benchmark applications demonstrate that Zorro is able to recover over 99% of the graph state when a few servers fail, and between 87% and 92% when half the cluster fails, with recovery taking only a fraction of the cost of a single iteration. Furthermore, using eight common graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios. In addition, preliminary analysis and experiments using our three alternative approaches suggest that they are able to address many of the potential issues Zorro faces with minimal overhead and accuracy loss.

    An Experimental Comparison of Partitioning Strategies in Distributed Graph Processing

    In this paper, we study the problem of choosing among partitioning strategies in distributed graph processing systems. To this end, we evaluate and characterize both the performance and resource usage of different partitioning strategies under various popular distributed graph processing systems, applications, input graphs, and execution environments. Through our experiments, we found that no single partitioning strategy is the best fit for all situations, and that the choice of partitioning strategy has a significant effect on resource usage and application run-time. Our experiments demonstrate that the choice of partitioning strategy depends on (1) the degree distribution of the input graph, (2) the type and duration of the application, and (3) the cluster size. Based on our results, we present rules of thumb to help users pick the best partitioning strategy for their particular use cases. We present results from each system, as well as from all partitioning strategies implemented in one common system (PowerLyra).

    Learning probabilistic relational planning rules

    To learn to behave in highly complex domains, agents must represent and learn compact models of the world dynamics. In this paper, we present an algorithm for learning probabilistic STRIPS-like planning operators from examples. We demonstrate the effective learning of rule-based operators for a wide range of traditional planning domains.

    Probability of Detection Study for Visual Inspection of Steel Bridges: Volume 2—Full Project Report

    An inspector’s ability to correctly identify surface and internal defects in steel bridge components is critical to protecting public safety. Ensuring that inspectors are properly trained and adequately equipped to detect these defects in locations that are difficult to access and/or in unfavorable environmental conditions must be a high priority. While the Federal Highway Administration and individual state departments of transportation have guidelines for inspector qualifications, trainings, and certifications, there is very little emphasis placed on evaluating or “testing” a given inspector’s capability to characterize defects in the field. As a result, there is also very little, if any, data on how well a given inspector actually performs or the variability which can be expected between inspectors. This comprehensive Probability of Detection (POD) study was conducted to establish the ability of an inspector with the current required training to locate and quantify cracks in steel bridge components. This study is believed to be the first statistically significant study of its kind in the United States.

    Probability of Detection Study for Visual Inspection of Steel Bridges: Volume 1—Executive Summary

    An inspector’s ability to correctly identify surface and internal defects in steel bridge components is critical to protecting public safety. Ensuring that inspectors are properly trained and adequately equipped to detect these defects in locations that are difficult to access and/or in unfavorable environmental conditions must be a high priority. While the Federal Highway Administration and individual state departments of transportation have guidelines for inspector qualifications, trainings, and certifications, there is very little emphasis placed on evaluating or “testing” a given inspector’s capability to characterize defects in the field. As a result, there is also very little, if any, data on how well a given inspector actually performs or the variability which can be expected between inspectors. This comprehensive Probability of Detection (POD) study was conducted to establish the ability of an inspector with the current required training to locate cracks in steel bridge components using typical visual inspection techniques.

    Zorro: Zero-Cost Reactive Failure Recovery in Distributed Graph Processing

    Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail a significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach trades off completeness of the result (generating a slightly inaccurate result) while reducing the overhead during failure-free execution to zero. We build a system called Zorro that embodies this reactive approach, and integrate Zorro into two graph processing systems – PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication (inherent in today’s graph processing systems) to quickly rebuild the state of failed servers. Experiments using real-world graphs demonstrate that Zorro is able to recover over 99% of the graph state when a few servers fail, and between 87% and 92% when half the cluster fails. Furthermore, using eight common graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios.
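
    The replica-based recovery idea described in this abstract can be illustrated with a minimal sketch. It is not Zorro's implementation: the data model (one dictionary of vertex values per server, holding both owned vertices and replicas of remote ones) and the names `recover_from_replicas`, `owner`, and `servers` are assumptions made purely for illustration. The sketch copies the value of each lost vertex from any surviving server that still holds a replica, and reports the fraction of lost state that could be rebuilt.

```python
from typing import Dict, Set, Tuple

# Hypothetical model: each server maps vertex id -> current value, covering
# the vertices it owns plus replicas of vertices owned elsewhere.
ServerState = Dict[int, float]


def recover_from_replicas(
    owner: Dict[int, int],
    servers: Dict[int, ServerState],
    failed: Set[int],
) -> Tuple[Dict[int, float], float]:
    """Rebuild values of vertices owned by failed servers from surviving replicas.

    Returns the recovered vertex values and the fraction of lost vertices
    that were recovered (1.0 when nothing was lost).
    """
    # Vertices whose owning server has failed.
    lost = [v for v, s in owner.items() if s in failed]
    recovered: Dict[int, float] = {}
    for server_id, state in servers.items():
        if server_id in failed:
            continue  # state held on a failed server is unavailable
        for v in lost:
            # Take the first surviving replica found for each lost vertex.
            if v in state and v not in recovered:
                recovered[v] = state[v]
    fraction = len(recovered) / len(lost) if lost else 1.0
    return recovered, fraction


if __name__ == "__main__":
    # Toy cluster: server 0 owns vertices 1 and 2; only vertex 1 is replicated.
    owner = {1: 0, 2: 0, 3: 1, 4: 2}
    servers = {
        0: {1: 0.5, 2: 0.1, 3: 0.2},
        1: {3: 0.2, 1: 0.5},
        2: {4: 0.9, 1: 0.5},
    }
    print(recover_from_replicas(owner, servers, failed={0}))
```

    In this toy setting, a vertex with no replica on a surviving server stays unrecovered, which is the source of the small loss in recovered state that the abstract reports when many servers fail.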

    Toward an Understanding of the Role of the Environment in the Development of Early Callous Behavior

    Key to understanding the long‐term impact of social inequalities is identifying early behaviors that may signal higher risk for later poor psychosocial outcomes, such as psychopathology. A set of early‐emerging characteristics that may signal risk for later externalizing psychopathology is callous‐unemotional (CU) behavior. CU behavior predicts severe and chronic trajectories of externalizing behaviors in youth. However, much research on CU behavior has focused on late childhood and adolescence, with little attention paid to early childhood when preventative interventions may be most effective. In this article, we summarize our recent work showing that (a) CU behavior can be identified in early childhood using items from common behavior checklists, (b) CU behavior predicts worse outcomes across early childhood, (c) CU behavior exhibits a nomological network distinct from other early externalizing behaviors, and (d) malleable environmental factors, particularly parenting, may play a role in the development of early CU behaviors. We discuss the challenges of studying contextual contributors to the development of CU behavior in terms of gene–environment correlations and present initial results from work examining CU behavior in an adoption study in which gene–environment correlations are examined in early childhood. We find that parenting is a predictor of early CU behavior even in a sample in which parents are not genetically related to the children. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/136006/1/jopy12221_am.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/136006/2/jopy12221.pd

    On Secure Workflow Decentralisation on the Internet

    Decentralised workflow management systems are a new research area, where most work to date has focused on the system's overall architecture. As little attention has been given to the security aspects of such systems, we follow a security-driven approach and consider, from the perspective of available security building blocks, how security can be implemented and what new opportunities arise when the decentralised environment is empowered with modern distributed security protocols. Our research is motivated by a more general question of how to combine the positive enablers that email exchange enjoys with the general benefits of workflow systems, and more specifically with the benefits that can be introduced in a decentralised environment. The aim is to equip email users with a set of tools to manage the semantics of a message exchange, its contents, and its participants and their roles, in an environment that provides inherent assurances of security and privacy. This work is based on a survey of contemporary distributed security protocols, and considers how these protocols could be used in implementing a distributed workflow management system with decentralised control. We review a set of these protocols, focusing on the message sequences each requires, and discuss how they provide the foundations for implementing core control-flow, data, and resource patterns in a distributed workflow environment.

    The SAMI Galaxy Survey: spatially resolving the main sequence of star formation

    We present the ∼800 star formation rate maps for the Sydney-AAO Multi-object Integral field spectrograph (SAMI) Galaxy Survey based on Hα emission maps, corrected for dust attenuation via the Balmer decrement, that are included in the SAMI Public Data Release 1. We mask out spaxels contaminated by non-stellar emission using the [O III]/Hβ, [N II]/Hα, [S II]/Hα, and [O I]/Hα line ratios. Using these maps, we examine the global and resolved star-forming main sequences of SAMI galaxies as a function of morphology, environmental density, and stellar mass. Galaxies further below the star-forming main sequence are more likely to have flatter star formation profiles. Early-type galaxies split into two populations with similar stellar masses and central stellar mass surface densities. The main-sequence population has centrally concentrated star formation similar to late-type galaxies, while galaxies >3σ below the main sequence show significantly reduced star formation, most strikingly in the nuclear regions. The split populations support a two-step quenching mechanism, wherein halo mass first cuts off the gas supply and remaining gas continues to form stars until the local stellar mass surface density can stabilize the reduced remaining fuel against further star formation. Across all morphologies, galaxies in denser environments show a decreased specific star formation rate from the outside in, supporting an environmental cause for quenching, such as ram-pressure stripping or galaxy interactions.